In this paper, we study incentive mechanisms for retrieving information from networked agents.
Following the model in [Kleinberg and Raghavan 2005], the agents are represented as nodes in an
infinite tree, which is generated by a random branching process. A query is issued by the root,
and each node possesses an answer with an independent probability $p = 1/n$. Further, each node
in the tree acts strategically to maximize its own payoff. In order to encourage the agents to
participate in the information acquisition process, an incentive mechanism is needed to reward
agents who provide the information as well as agents who help to facilitate such acquisition.

We focus on designing efficient sybil-proof incentive mechanisms, i.e., mechanisms that are
robust to fake identity attacks. We propose a family of mechanisms, called the direct referral
(DR) mechanisms, which allocate most of the reward to the information holder as well as its
direct parent (or direct referral). We show that, when designed properly, the direct referral
mechanism is sybil-proof and efficient. In particular, we show that we may achieve an expected
cost of $O(h^2)$ for propagating the query down $h$ levels for any branching factor $b > 1$.
This result exponentially improves on previous work when an answer must be found with high
probability. When the underlying network is a deterministic chain, our mechanism is optimal
under some mild assumptions. In addition, due to its simple reward structure, the DR mechanism
might have a good chance of being adopted in practice.

Categories and Subject Descriptors: G.2 [Mathematics of Computing]: Discrete Mathematics; G.3
[Mathematics of Computing]: Probability and Statistics; F.2.0 [Analysis of Algorithms and
Problem Complexity]: General; J.4 [Social and Behavioral Sciences]: Economics

General Terms: Economics, Theory

Additional Key Words and Phrases: query incentive networks; query incentive mechanisms;
sybil-proof mechanisms; branching processes

1. INTRODUCTION

Many information systems, e.g., peer-to-peer networks or social networks, are designed such
that queries are answered by networked agents instead of a centralized authority. In such a
system, the query propagates in the network with the hope that it will eventually reach some
agents which hold (and return) an answer. For such query models, it is important to design an
incentive mechanism to encourage the query propagation and the return of the answer. In
addition, we would like the mechanism to be efficient, with low expected cost for the root,
and sybil-proof, i.e., discouraging the agents from disrupting or delaying the query process
by producing fake identities. In this paper, we present a family of mechanisms which achieve
these two goals simultaneously.

We mainly follow the query incentive network model introduced by Kleinberg and Raghavan
[2005]. Under the model, each agent is represented as a node in a fixed infinite $d$-ary tree.
A query is issued by the root, and each node in the tree may have an answer with a fixed
probability $p$. Information, query or answer, can propagate along the edges in the tree
which are "turned on" according to a random branching process. Each node, with a local view of
the tree, can decide whether to continue propagating the query to its children and whether to
forward an answer back to its parent. The nodes are self-interested and risk-neutral, so they
choose the actions that maximize their expected payoff. In the model, there is a fixed unit
cost for forwarding a selected answer along each edge. On the other hand, propagating the
queries is free.

In [Kleinberg and Raghavan 2005], the authors considered incentive mechanisms in the form of
fixed-payment contracts, where each node offers a fixed amount of reward, which is in turn a
part of the reward promised by its parent, to its children, under the condition that the child
propagates back an answer (which is accepted by the root). In the paper, the authors obtain the
lowest possible cost (or reward) needed for such mechanisms to retrieve an answer with a
constant probability. The cost depends on the rarity of the answer, defined as $n = 1/p$, and
the branching factor $b$, the expected number of edges to children that are "turned on" by the
random process. The paper showed a phase transition phenomenon at $b = 2$: when $b > 2$, the
mechanism can achieve a low cost of $O(\log n)$; when $b < 2$, the cost explodes to a
polynomial in $n$, which is exponential in the number of levels that need to be explored.

The work of Kleinberg and Raghavan [2005] motivated many subsequent works. In particular,
Arcaute et al. [2007] extended the results to more general random processes, and observed a
similar phase transition phenomenon. Cebrian et al. [2012] analyzed a different type of
mechanism, called split contract, which models a successful scheme used by the winner
[Pickard et al. 2011] of the DARPA Network Challenge (also known as the Red Balloon Challenge)
[DARPA 2009]. Roughly speaking, in the split contract mechanism, the answer holder receives
the specified reward, while each "referral" on the path to the root receives a fraction of its
child's reward. Cebrian et al. [2012] showed that the split contract can achieve low cost even
when $b < 2$. However, unlike the fixed-payment contract scheme, the split contract is not
sybil-proof, as one can produce fake identities in the tree to obtain a higher expected payoff.

In our paper, we propose a new family of mechanisms which distribute most of the reward to the
agent who provides the answer and to its direct referral, i.e., its parent. We call such query
incentive mechanisms Direct Referral (DR) mechanisms. We show that the direct referral
mechanism, when designed properly, can discourage sybils, i.e., it is not in the agents'
interest to create fake identities, as well as obtain a low cost to retrieve an answer for any
$b > 1$. In particular, the scheme still has a low cost when the success probability is close
to $1 - \zeta$, e.g., with probability $1 - \zeta - O(1/n)$, where $\zeta$ is the extinction
probability of the branching process. Both fixed-payment contracts and split contracts incur
high cost when the success probability approaches $1 - \zeta$.

The following two theorems summarize the main results. 

Theorem 1.1. If the underlying branching process is a deterministic chain, there exists
a sybil-proof direct referral incentive mechanism with expected cost $O(nh^2)$, where $h$ is the
desired level of agents to which the root wants to propagate the query and $n$ is the answer rarity.

Furthermore, the direct referral mechanism is optimal on the chain among all the sybil-proof 
query incentive mechanisms which satisfy some mild assumptions (Section [ 



Theorem 1.2. For any branching process with branching factor $b > 1$, there is a
constant $p_b > 0$ such that for any answer rarity $n$ with $0 < 1/n < p_b$, there exists a
sybil-proof direct referral query incentive mechanism with expected cost $O(h^2)$, where $h$ is
the desired level of agents to which the root wants to propagate the query.

Notice that the above bound holds for $b > 1$, so the direct referral mechanism, compared to
the fixed-payment contract mechanism, can achieve low cost for a larger family of branching
processes. In addition, the cost (when $b > 1$) has a polynomial dependence on the level $h$
instead of the rarity $n$. Therefore, in the case of retrieving an answer with high
probability, i.e., a probability close to one minus the extinction probability of the
branching process, the expected cost is still polynomial in $h$, rather than polynomial in $n$
(in this case, $n$ is exponential in $h$). In contrast, query incentive networks with either
fixed-payment contracts [Arcaute et al. 2007] or split contracts [Cebrian et al. 2012] have
cost polynomial in $n$ in the high probability case.

The direct referral mechanism has the natural structure of rewarding the answer holder 
and the direct referral while the others only receive minimum compensation for routing 
the answer. Such simplicity might be highly desirable for practical adoption. On the other 
hand, despite the simplicity of the direct referral scheme, we show that the mechanism can 
be quite robust, i.e., sybil-proof, and efficient. In fact, in the case of the infinite chain, we
can show that such a scheme is optimal among all the sybil-proof mechanisms under some mild
assumptions.

Intuitively, by rewarding the direct referral, the DR mechanism encourages any agent 
who does not have an answer to propagate the query as there is a chance that one of its 
children may have an answer (and get selected by the root) so the agent can win the direct 
referral reward. In addition, for any agent, no matter how many sybils it creates, at most 
one of them receives the "direct referral" reward. Therefore, if we design the direct referral
rewards such that they decrease rapidly enough as the depth increases, the potential gain of 
the sybils may be offset by the gap between the direct referral rewards in the two different 
levels. Of course, the gap cannot be too large for it would increase the expected cost for 
the root. Indeed, we will show that with mildly decreasing direct referral rewards, we can 
achieve both sybil-proofness and low cost at the same time. 

1.1. Related Work 

The model of query incentive networks was introduced by Kleinberg and Raghavan [2005],
where they considered a simple branching process in an underlying $d$-ary tree, i.e., each
edge exists with an independent probability $b/d$, giving branching factor $b > 1$. Kleinberg
and Raghavan observed an interesting phase-transition phenomenon with fixed-payment
contracts. Specifically, when $b > 2$, in order to retrieve the answer with constant
probability, the reward that the root needs to offer is $O(\log n)$, which is asymptotically
optimal. However, when $1 < b < 2$, the cost grows to a polynomial in $n$, i.e., the root needs
to pay a reward that is exponentially larger than the expected distance for finding an answer
in this case.



Arcaute et al. [2007] generalized the simple branching process
in [Kleinberg and Raghavan 2005] to an arbitrary GW branching process in which
the number of children of a node is determined by a fixed offspring distribution. They
observed that the phase transition phenomenon at branching factor $b = 2$ still exists in
this general case. Furthermore, they also observed that when the answer must be found
with high probability, e.g., with probability $1 - \zeta - \frac{1}{n}$, where $\zeta$ is the
extinction probability of the branching process, the phase transition phenomenon vanishes.
In particular, for any branching process with branching factor $b > 1$, to retrieve an answer
with high probability,
the required reward is polynomial in $n$. They also showed that in a deterministic chain ($b = 1$), the
cost of the root is $\Omega(n!)$ for finding an answer with constant probability.

Kota and Narahari [2010] analyzed the reward for such fixed-payment contracts
when the degree distribution follows a power law. Dikshit and Yadati [2009] considered
the quality of the answers in query incentive networks. Both models exhibit phase
transition phenomena at branching factor $b = 2$ similar to those found in
[Arcaute et al. 2007; Kleinberg and Raghavan 2005].



Cebrian et al. [2012] presented a split contract based mechanism motivated by the success
of the winning team of the DARPA challenge. In this mechanism, the root provides a reward
$r$ to the answer holder. A "shortest path" based answer selection scheme is adopted in case of
multiple reachable answers (see Section 2.3.1). Each node on the path to the selected answer
will receive a fraction of the reward received by its child. With the split contract mechanism,
it is shown that the phase transition phenomenon at $b = 2$ vanishes. In particular, for any
GW branching process with branching factor $b > 1$, the cost to retrieve an answer with
constant probability is $O(\log n)$ in a Nash equilibrium, which is asymptotically optimal.
Therefore, split contract based query incentive networks are more efficient. On the other
hand, if we want to retrieve an answer with high probability, split contracts also need a
reward polynomial in $n$.

In the previous studies on the query incentive networks, generating fake identities is not 
part of the agents' strategy. In other words, sybil-proofness is not explicitly explored. In fact, 
we show that the fixed contract based mechanisms are "sybil-proof" while split contract 
mechanisms are clearly not. 

How to prevent sybil attacks [Douceur 2002] has been studied in many aspects of computer
networks. Babaioff et al. [2012] presented a sybil-proof scheme for the Bitcoin system.
There are several major differences between the Bitcoin system and a query incentive
network. Most notably, instead of a branching process, the network in the Bitcoin system can
be intentionally constructed, which gives the mechanism designer additional freedom
to address sybil-proofness. Therefore, the results in [Babaioff et al. 2012] cannot be
directly applied to a query incentive network. They adopted the iterated removal of dominated
strategies, which is a stronger solution concept than the Nash equilibrium used in this paper.
Sybil-proof mechanisms for multi-level marketing are studied in [Emek et al. 2011;
Drucker and Fleischer 2012]. In multi-level marketing, there is a fixed cost (price)
for a sybil to purchase the product. Therefore, to enforce sybil-proofness, the mechanisms
try to cap the referral fees. Douceur and Moscibroda [2007] gave a sybil-proof lottery tree
mechanism for motivating people to install and run a distributed service in a peer-to-peer
system. The issue of sybil attacks has also appeared in many other contexts such as
reputation mechanisms [Cheng and Friedman 2005], combinatorial auctions [Todo et al. 2009],
social choice [Wagman and Conitzer 2008; Conitzer and Yokoo 2010], and cost-sharing games
[Penna et al. 2009]. One major difference between our problem (as well as that of Babaioff et
al. [2012]) and these problems is that they deal with static configurations, which make
sybil-proofness hard to achieve. In our results, we punish sybils by reducing their
probability of winning in a probabilistic environment. In particular, our mechanism is not
sybil-proof if the agents know the outcome of the environment. Such a trade-off for sybils,
i.e., more reward conditioned on winning with smaller winning probability, is not available
with static inputs. Nevertheless, we believe the results in this paper may be of interest in
other settings.

2. QUERY INCENTIVE NETWORKS: A NORMAL FORM 

In this section, we describe the problem and the model formally. In particular, we provide
formal definitions of the random branching process for information propagation and of the
various components that constitute an incentive mechanism.

2.1. The branching process 

Following previous works, the underlying network is generated by a Galton-Watson (GW)
branching process on an infinite $d$-ary tree. In the branching process, each node $v$ samples
its number of children $C(v)$ independently according to a given distribution $D$. Afterwards,
$v$ selects $C(v)$ children out of its $d$ children uniformly at random, and connects to them.
The final tree $T_r$ in the branching process is the connected component containing the root.
We call an agent $u$ active if $u \in T_r$ and non-active otherwise.
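The sampling procedure just described can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function name `sample_gw_tree`, the encoding of a node as the tuple of child slots leading to it, and the truncation at depth `h` are choices made here for illustration and do not come from the paper.

```python
import random
from typing import Dict, List, Sequence, Tuple

Node = Tuple[int, ...]  # hypothetical encoding: the child slots taken from the root


def sample_gw_tree(c: Sequence[float], d: int, h: int,
                   rng: random.Random) -> Dict[Node, List[Node]]:
    """Sample the active tree down to depth h.

    c[i] is the probability that a node has i children (the distribution D),
    d is the arity of the underlying tree. Returns node -> connected children.
    """
    if len(c) > d + 1:
        raise ValueError("offspring distribution puts mass on more than d children")
    tree: Dict[Node, List[Node]] = {}
    frontier: List[Node] = [()]  # the root is the empty tuple
    for _ in range(h):
        next_frontier: List[Node] = []
        for node in frontier:
            # Draw the number of children C(v) from D.
            k = rng.choices(range(len(c)), weights=c)[0]
            # Pick C(v) of the d child slots uniformly at random.
            slots = sorted(rng.sample(range(d), k))
            tree[node] = [node + (s,) for s in slots]
            next_frontier.extend(tree[node])
        frontier = next_frontier
    for node in frontier:  # nodes at depth h are kept as leaves
        tree[node] = []
    return tree
```

Every node that appears as a key of the returned dictionary is active in the sense above; all other nodes of the $d$-ary tree are non-active.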
Formally, given an offspring distribution $D = \{c_i\}_{i=0}^{d}$, where $\sum_{i=0}^{d} c_i = 1$
and $c_i \ge 0$ is the probability of having $i$ children, define the probability generating
function of the branching process as

$$\Psi(x) = \sum_{i=0}^{d} c_i x^i. \tag{1}$$

A basic parameter for the branching process is the branching factor $b = \sum_{i=0}^{d} i\,c_i$,
defined as the expectation of $D$. The extinction probability $\zeta$ of the branching process
is the probability that the branching process dies out, i.e., the final tree $T_r$ is finite.
A well-known fact is that $\zeta = 1$ if and only if $b < 1$, or $b = 1$ with $c_0 > 0$, and
$0 \le \zeta < 1$ otherwise.
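As a numerical illustration, the branching factor and the extinction probability can be computed directly from the offspring distribution. The sketch below relies on the standard fact from branching process theory that the extinction probability is the smallest fixed point in $[0, 1]$ of the probability generating function in Eq. (1), reached by iterating the function from 0. The function names and the tolerance are assumptions made here, not part of the paper.

```python
from typing import Sequence


def branching_factor(c: Sequence[float]) -> float:
    """Mean of the offspring distribution: sum_i i * c[i]."""
    return sum(i * ci for i, ci in enumerate(c))


def pgf(c: Sequence[float], x: float) -> float:
    """Probability generating function: sum_i c[i] * x**i."""
    return sum(ci * x**i for i, ci in enumerate(c))


def extinction_probability(c: Sequence[float], tol: float = 1e-12,
                           max_iter: int = 1_000_000) -> float:
    """Iterate x <- pgf(x) from x = 0; the limit is the smallest fixed point in [0, 1]."""
    x = 0.0
    for _ in range(max_iter):
        nxt = pgf(c, x)
        if abs(nxt - x) < tol:
            return nxt
        x = nxt
    return x  # not converged within max_iter (slow in the critical case b = 1)
```

For example, with `c = [0.25, 0.25, 0.5]` the branching factor is 1.25 and the fixed-point equation $0.25 + 0.25x + 0.5x^2 = x$ has roots $0.5$ and $1$, so the iteration settles at $0.5$. For the deterministic chain `c = [0.0, 1.0]` the iteration stays at 0, in line with the case $b = 1$, $c_0 = 0$.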

For the query issued from the root, each node in the underlying tree has an answer
with an independent probability $p = 1/n$, where $n$ represents the rarity of the answer. So in
expectation, we need to reach $\Omega(n)$ nodes to obtain an answer with constant probability.
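To make the role of the rarity concrete: since answers are placed independently, the probability that at least one of $m$ reached nodes holds an answer is $1 - (1 - 1/n)^m$. The small sketch below evaluates this expression and inverts it; the function names are hypothetical and the code assumes $n > 1$.

```python
import math


def prob_answer_found(n: float, m: int) -> float:
    """Probability that at least one of m independent nodes holds an answer."""
    return 1.0 - (1.0 - 1.0 / n) ** m


def nodes_needed(n: float, target: float) -> int:
    """Smallest m with prob_answer_found(n, m) >= target, for 0 < target < 1."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - 1.0 / n))
```

With $m = n$ the success probability is close to $1 - 1/e$, while reaching only a small fraction of $n$ nodes leaves it close to zero, which is why a number of nodes linear in $n$ is needed for a constant success probability.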

2.2. Query model 

Given the above process for generating the network, the query process works as follows: 

(1) The root announces the incentive mechanism, which stipulates rules for selecting the 
answer and for rewarding the involved agents. 

(2) The query is propagated, starting from the root, down the tree as generated by the 
above branching process. 

(3) Each node (agent) in the tree, when receiving the query, may decide whether to continue 
to propagate the query, and whether, in case it has an answer, to report the answer 
back to its parent.

(4) When an agent has an answer and/or receives reports of answers from its subtree, it 
chooses to report a subset (can be empty) of the answers to its parent. After the root 
receives all reports, a winning answer is selected. 

(5) The holder of the winning answer forwards the answer to the root through the path in 
the tree. If any node on the path decides not to forward, no payment will be made. 

(6) Once the root receives the answer, the rewards are paid to the nodes according to the 
rule announced in Step (1). 

One significant difference between our model and the previous work is that in our model 
the incentive mechanism is determined by the root but the previous work allows a mixture of 
global and local contracts. For example, in the fixed-payment contract or the split contract 
schemes, the type of contracts as well as the process for selecting an answer are fixed globally, 
but each node can decide locally how to enter into a contract with its parent or children. 

In our model, the incentive mechanism announced by the root contains a global reward 
allocation scheme, which maps any final configuration to a set of rewards to the agents. 
While such a global reward scheme may seem limiting as it takes away the freedom enjoyed 
by the nodes in the fixed-payment contract and split contract mechanisms, it is still non- 
trivial to address the new challenge of sybil attacks. On the other hand, we show later in 
this section that it is possible to describe the equilibrium of a local contract-based scheme 
by a global reward allocation scheme. 

2.3. Query incentive mechanism 

We call an answer (or the agent who has the answer) reachable from the root if all the 
agents on the path from the root to the answer are active in the branching process and 
decide to propagate the query. The query incentive mechanism determines how an answer is 
selected when there are multiple reachable answers, and how the rewards are allocated. To 
prevent arbitrarily complex mechanisms, we focus on mechanisms that satisfy the following 
properties. 
(1) Complete: If there exist reachable answers, the mechanism will select one.

(2) Unique: Only one answer is selected if multiple answers are presented.

(3) Anonymous: Agents are not treated preferentially a priori.

Such properties have been implicit in the previous work. In addition, requiring these
properties makes the analysis easier. For example, anonymity is helpful in addressing
sybil-proofness since the identities of the agents can be manipulated. We now describe
families of mechanisms that achieve the above properties. We divide the query incentive 
mechanism into two steps: the answer selection step and the reward allocation step. 

2.3.1. Answer selection. The answer selection step chooses one answer when multiple answers
are reachable from the root. We consider two answer selection schemes, both of which have
appeared in the literature and satisfy the above properties.

— Random Walk (RW): In the RW scheme, starting from the root, at each step we select
one child uniformly at random from those children who have reported the existence of
answers in their subtrees. We continue the random walk until we reach an answer, which is
selected.

— Shortest Path (SP): Among all the reachable answers, we exclude those that are not the 
closest to the root. We then perform the above RW process for the remaining answers. 

Both schemes have a very natural interpretation. In the RW scheme, after the formation
of the network, each node that has an answer reports its answer to its parent. If a node
that does not have an answer itself receives reports of multiple answers from all children, it
randomly selects one and reports it to its parent. In particular, if a node has an answer,
it will not report answers from its subtree. The process continues until the root selects one
of the answers reported to it uniformly at random.

The SP scheme, on the other hand, can be viewed as the RW scheme for impatient agents.
A node reports back to its parent as soon as it receives the query, in case it has
an answer. When the node does not have an answer, it immediately reports back when it
receives reports of answers. In case multiple answers are reported simultaneously, it selects
one uniformly at random. However, unlike in the RW scheme, the node does not wait for the
responses from all children. We will use the SP answer selection scheme in our mechanisms
(see footnote 1).
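The two selection schemes can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the reached part of the network is given as a dictionary `tree` mapping each node to the children that received the query, that `answers` is the set of reachable answer holders, and that the root itself never counts as an answer holder. All names are hypothetical.

```python
import random
from collections import deque
from typing import Dict, Hashable, List, Optional, Set

Tree = Dict[Hashable, List[Hashable]]  # node -> children that received the query


def subtree_has_answer(tree: Tree, answers: Set[Hashable], node: Hashable) -> bool:
    """True if `node` or one of its descendants is in `answers`."""
    if node in answers:
        return True
    return any(subtree_has_answer(tree, answers, child) for child in tree.get(node, []))


def random_walk_select(tree: Tree, answers: Set[Hashable], root: Hashable,
                       rng: random.Random) -> Optional[Hashable]:
    """RW: walk down from the root, choosing uniformly among children that reported answers."""
    node = root
    while True:
        candidates = [c for c in tree.get(node, []) if subtree_has_answer(tree, answers, c)]
        if not candidates:
            return None  # only possible at the root: no reachable answer
        node = rng.choice(candidates)
        if node in answers:
            return node  # an answer holder does not report answers from its subtree


def shortest_path_select(tree: Tree, answers: Set[Hashable], root: Hashable,
                         rng: random.Random) -> Optional[Hashable]:
    """SP: keep only the answers closest to the root, then run RW on them."""
    depth = {root: 0}
    queue = deque([root])
    closest: Set[Hashable] = set()
    best_depth: Optional[int] = None
    while queue:  # breadth-first search, so depths are visited in increasing order
        node = queue.popleft()
        if best_depth is not None and depth[node] > best_depth:
            break
        if node != root and node in answers:
            best_depth = depth[node]
            closest.add(node)
            continue
        for child in tree.get(node, []):
            depth[child] = depth[node] + 1
            queue.append(child)
    return random_walk_select(tree, closest, root, rng)
```

On a small tree where the root has children `a` and `b`, `a` has a child `c`, and both `b` and `c` hold answers, SP always selects `b`, while RW selects `b` or `c` with probability one half each.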

2.3.2. Reward allocation. Once an answer is selected, the reward allocation step determines 
how the rewards are assigned to the nodes on the answer path P, i.e., the path from the 
root to the selected answer. To achieve anonymity, we require the reward allocation scheme 
to be oblivious. 

Definition 2.1. Oblivious Reward Allocation Scheme. A reward allocation scheme
maps any particular answer path $P$ in a resulting branching process $T_r$ to a set of payments
to the agents in $P$:

$$f : T_r,\, P \in T_r \to [1, \infty)^{|P|}.$$

The reward scheme $f$ is oblivious if for all $P \in T_r$ and $P' \in T_r'$ such that
$|P| = |P'|$, we have

$$f(T_r, P) = f(T_r', P').$$

For a node $u \in P$, we denote its reward as $f_u(P)$.
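One way to read Definition 2.1 is that an oblivious scheme is simply a function from the path length to a list of rewards, one per agent on the path, each at least one. The sketch below is purely illustrative: the reward values are arbitrary placeholders and are not the rewards of the mechanisms proposed in this paper. It also assumes that the path lists the non-root agents in order from the root's child to the answer holder.

```python
from typing import Callable, List

# An oblivious scheme depends only on the path length |P|.
RewardScheme = Callable[[int], List[float]]


def referral_style_scheme(holder_reward: float, referral_reward: float) -> RewardScheme:
    """Hypothetical example: the last two agents on the path get larger rewards, the rest get 1."""
    if holder_reward < 1.0 or referral_reward < 1.0:
        raise ValueError("rewards are normalized to be at least one")

    def scheme(path_len: int) -> List[float]:
        if path_len == 0:
            return []  # empty path: no reachable answer, nothing is paid
        rewards = [1.0] * path_len
        rewards[-1] = holder_reward  # the answer holder
        if path_len >= 2:
            rewards[-2] = referral_reward  # the parent of the answer holder
        return rewards

    return scheme
```

Because `scheme` never looks at agent identities or at the shape of the tree, two answer paths of equal length always receive the same reward vector, which is exactly the obliviousness condition.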

In particular, an oblivious reward allocation scheme only cares about the length of the answer
path. It does not consider the identities of the agents in the path and the structure of the
resulting branching process. Although this is a restriction on the space of reward allocations,
oblivious reward allocation schemes are convenient in case of sybil attacks, since both the
identities and the structure of the branching process can be manipulated unexpectedly. In
fact, we will show that the equilibria studied in the fixed-payment contract and split contract
based query incentive networks imply oblivious reward allocation schemes.

Footnote 1: The reporting strategy described above is only for interpretation. In a Nash
equilibrium, we assume an agent will report all answers it is aware of. The tie-breaking is
handled by the root after receiving all reports.

Remark: In our model, we assume there is no cost for the agents. To avoid trivial reward
allocations, we require that all rewards be normalized to be at least one. This is slightly
different from the previous literature on query incentive networks, where a fixed unit cost is
assumed for forwarding the final selected answer. We choose the reward normalization instead
of the forwarding cost to avoid defining the answer forwarding cost for sybils. Nevertheless,
it is straightforward to generalize our results to the unit forwarding cost case as long as the
cost for sybils is properly defined.

For a query incentive mechanism with an oblivious reward allocation scheme, the expected
cost of the mechanism is defined as

$$\mathbb{E}_P\left[\sum_{u \in P} f_u(P)\right],$$

where $P$ is the answer path selected and $f(\cdot)$ is the reward allocation scheme. If there is
no reachable answer, $P$ is empty. The expectation is taken over the randomness of the
branching process, the answer distribution and the answer selection scheme.
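For intuition, the expected cost can be evaluated in closed form in the simplest setting. The sketch below assumes a deterministic chain, a query propagated through levels 1 to `h`, and agents who report an answer as soon as they hold one, so that the selected answer is the first holder on the chain. Under these assumptions the path has length $i$ with probability $(1-p)^{i-1} p$. The function name and the representation of a scheme as a function of the path length are choices made here for illustration.

```python
from typing import Callable, List


def expected_cost_on_chain(scheme: Callable[[int], List[float]], n: float, h: int) -> float:
    """Expected total payment on a chain with answer probability p = 1/n and h levels."""
    p = 1.0 / n
    total = 0.0
    for depth in range(1, h + 1):
        # The first answer holder sits at this depth: depth - 1 misses, then one hit.
        prob_path_ends_here = (1.0 - p) ** (depth - 1) * p
        total += prob_path_ends_here * sum(scheme(depth))
    return total  # with the remaining probability the path is empty and nothing is paid


def unit_scheme(path_len: int) -> List[float]:
    """Placeholder scheme paying the minimum reward of 1 to every agent on the path."""
    return [1.0] * path_len
```

For example, with `unit_scheme`, $n = 2$ and $h = 2$, the path has length 1 with probability $1/2$ (cost 1) and length 2 with probability $1/4$ (cost 2), giving an expected cost of $1$.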

2.4. Sybil attack 

Once the incentive mechanism is announced, an agent which is reachable from the root can 
choose the action to maximize its (expected) payoff. The agent can choose to propagate the 
query or not. When it has answers, either from itself or reported from its children, it can
choose to report a (possibly empty) subset of them to its parent. One important action we 
consider is the creation of fake identities (or sybil attack). We allow the following type of 
sybil attacks. 

— Tree augmentation with sybils. One agent is allowed to generate a possibly infinite
tree of sybils attached to its parent. Its original children can be attached to one particular
sybil in the tree (see footnote 2). The reward of the agent is the total reward received by all
its sybils.

— Answer placement. If the agent has an answer, it is allowed to place its answer at
any subset of the sybils.

— Decision timeframe. We always assume an agent knows whether it has an answer or 
not before its strategic decision. We can also assume the action taken by the agent is 
conditioned on the event that it is active in the branching process. 

With the above definition of sybil attack, we call a query incentive mechanism sybil-proof
if the strategy profile in which no agent generates any sybil is a Nash equilibrium.

Definition 2.2 (Sybil-proof query incentive mechanisms). A query incentive mechanism, 
consisting of an answer selection scheme and a reward allocation scheme, is sybil-proof with 
level h if the following strategy profile is a Nash equilibrium: 

— All agents in the underlying $d$-ary tree at levels $1$ to $h - 1$: If the agent has an
answer, it directly reports back to its parent and does not propagate the query. Otherwise, the
agent chooses to propagate the query to its children. After it receives reports of answers
from the children, it reports all answers to its parent. It chooses to forward the answer
to its parent if an answer is selected from its subtree.

— The agents at level $h$: If the agent has an answer, it directly reports back to its parent.
If the answer is selected, it chooses to forward the answer to its parent. The agent will
not propagate the query further.

Footnote 2: It would be interesting to consider the case in which different children can be
attached to different sybils in the tree. Our analysis unfortunately cannot handle this case.

Although we assume the decision of the agent is conditioned on the event that it is active 
in the branching process, such condition is in fact not necessary, i.e., the agent should take 
the same strategy independent of whether it is active. This can be easily seen since if the 
agent is not active, all strategies will have utility zero. In what follows, we will analyze our 
mechanisms without conditioning on the activeness of the agents. 

2.5. Formulating contract-based mechanisms by global reward allocation schemes 

To illustrate the connection between the global reward allocation scheme studied in the 
paper and the previous work, we describe how one would formulate an equilibrium of a 
contract-based mechanism as a global reward allocation scheme. We will mainly discuss the 
case for the fixed-payment contracts. The case for the split contracts is similar. 

In a fixed-payment contract query incentive network, the root provides a total reward $r_0$
for an answer to the query. In particular, the root enters a contract with its child $u$, i.e.,
if an answer from $u$ is selected, the root will pay $r_0$ to $u$. Then the query is propagated
down the tree, during which each node $v$ determines the reward to its children. Finally, if
there are multiple answers reported, the root will select an answer using the RW answer
selection scheme. The agents on the path from the root to the selected answer holder are
offered the reward based on the fixed-payment contracts determined. In particular, the utility
of an agent $v$ on the path with parent $u$ is $r_u - r_v - 1$, where $r_u$ is the reward
offered by its parent and $r_v$ is the reward $v$ offers to its children. If $v$ holds the
selected answer, $r_v = 0$. Notice that all agents on the path except the root pay a unit cost
to forward the answer.

The strategy of each participating node $v$ is a function $f_v(\cdot)$, i.e., if the offer
from its parent $u$ is $r_u$, it will offer a reward of $r_v = f_v(r_u)$ to its children. Then
an equilibrium in the query incentive network is defined by the set of functions $\{f_v\}$ for
all participating nodes. (For simplicity, we omit in this section the discussion of the nodes'
decision whether to participate.)

For any given equilibrium in the fixed-payment contract query incentive network, we can
construct a reward allocation scheme as follows. Let $P = \{v_1, v_2, \ldots, v_l\}$ be the
sequence of nodes on the path from the root (excluded) to the selected answer holder. We will
reward node $v_i$ by the amount $f_{v_{i-1}}(f_{v_{i-2}}(\cdots f_{v_1}(r)))$, where $r$ is the
total reward offered by the root. (If $i = 1$, the first node receives a reward of $r$.)
Clearly, this is a valid reward allocation scheme. Furthermore, if the equilibrium is
symmetric, i.e., all nodes in the $i$-th level play the same strategy $f_i(\cdot)$, the
corresponding reward allocation scheme is in fact oblivious.
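The construction above amounts to composing the strategy functions along the path. A minimal sketch, assuming the strategies of the first $l - 1$ nodes on the path are given in order as Python callables (the function name and the sample strategy in the usage note are hypothetical):

```python
from typing import Callable, List, Sequence


def rewards_from_contracts(strategies: Sequence[Callable[[float], float]],
                           r: float) -> List[float]:
    """Rewards along a path of len(strategies) + 1 nodes.

    strategies[j] is the strategy of the (j+1)-th node on the path: it maps the
    offer that node receives to the offer it passes on to its children.
    """
    rewards = [r]  # the first node on the path receives the root's offer r
    for f in strategies:
        rewards.append(f(rewards[-1]))  # each later node receives its parent's offer
    return rewards
```

For instance, if every node were to pass on half of what it is offered (an arbitrary strategy chosen only for illustration), `rewards_from_contracts([lambda x: x / 2, lambda x: x / 2], 8.0)` yields the rewards 8, 4 and 2 for a path of three nodes. When all nodes at the same level use the same function, the output depends only on the path length, which is the obliviousness property.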

For an equilibrium in the split contract query incentive network, we can construct such
a reward allocation scheme as well. However, it is clear that such an equilibrium does not
lead to a sybil-proof query incentive mechanism. Consider the case of a single chain with
branching factor $b = 1$. The first agent who has the answer can create a sybil and sign a
split contract with ratio $0$ with it. In this way, the agent gets all the initial reward, which
is much more than its fair share in the equilibrium.

For query incentive networks with fixed-payment contracts, it is shown that there exists
a best-interest Nash equilibrium [Cebrian et al. 2012] with a unique strategy function
$f(\cdot)$ for all participating agents [Kleinberg and Raghavan 2005]. Notice that if an agent
has an answer, it will report truthfully and pocket all the reward offered, because creating
sybils will not increase the total reward in this case while it could reduce the probability
that its answer is selected in either the RW or the SP scheme. In fact, we have the following
result.

Theorem 2.3. The best-interest Nash equilibrium in the fixed-payment contract query 
incentive network with the RW answer selection scheme defines a sybil-proof query incentive 
mechanism. 
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Proof. Let the best interest Nash equihbrium in the fixed-payment contracts be {/(•)} 
witli initial reward vq. We have shown that if one agent has the answer, it has no incentive to 
attack with sybils. It is sufficient to consider agents that do not have answers. Now assume 
the reward allocation scheme defined by /(•) is not sybil-proof, i.e., one agent at level i has 
incentive to attack with k additional sybils between itself and its children in the normal 
form. 

Correspondingly, consider the original game with fixed-payment contracts. We show that 
the same agent at level i would benefit by not following /(•) in the equilibrium. In particular, 
in receiving offer ri_i, the agent will offer its children r^+fc = /('^+^^(r,;_i) instead of r^ = 
/(fj-i), where /''^'*'^-'(-) is the function to iteratively apply /(■) for fc -|- 1 times. 

In both cases, i.e., the normal form with $k$ sybils and the fixed-payment contract game
with offer $f^{(k+1)}(r_{i-1})$, all the (real) descendants will react as if they are lower in the tree.
Notice that the two trees in the two cases are slightly different. However, the RW answer 
selection scheme will not be impacted by the sybils. Therefore, for the attacking agent, the 
probabilities that it is on the answer path are the same in both cases. 

Let $p_1$ and $p_2$ be the probabilities that the agent is on the answer path if it does not
attack and if it attacks with $k$ sybils, respectively. With answer selection scheme RW, we
have $p_1 \ge p_2$. By the assumption of non-sybil-proofness in the normal form without
forwarding cost, we have

$$p_2 \cdot (r_{i-1} - r_{i+k}) > p_1 \cdot (r_{i-1} - r_i). \tag{2}$$

Notice that in the fixed-payment contract game, the expected utility if this agent plays
truthfully is $p_1 \cdot (r_{i-1} - r_i - 1)$. The expected utility if this agent plays $r_{i+k}$ is
$p_2 \cdot (r_{i-1} - r_{i+k} - 1)$. Since $p_2 \le p_1$, by Eqn. (2), we have

$$p_2 \cdot (r_{i-1} - r_{i+k} - 1) > p_1 \cdot (r_{i-1} - r_i) - p_2 \ge p_1 \cdot (r_{i-1} - r_i - 1).$$

In other words, $f(\cdot)$ is not an equilibrium of the game with fixed-payment contracts. This is a
contradiction. □

Remark: We showed the sybil-proofness with the RW answer selection scheme. Our proof 
does not directly apply to the SP answer selection scheme, which we leave as an interesting 
problem. 

Although fixed-payment contracts already imply a sybil-proof query incentive mechanism,
it is not cost-effective for the case of $b < 2$ [Kleinberg and Raghavan 2005] or if one wants
to find an answer with probability 1 — ( — - [Arcaute et al. 2007] . In the rest of the paper 
we will propose a new mechanism that is more cost-effective in these two cases. 

3. TECHNICAL PREPARATIONS 

One main complexity of our analysis comes from the branching process. In this section, we 
summarize and develop some technical results regarding the branching process which will 
be needed in the analysis. 

Define $\phi_i$ as the probability that there is no answer in the nodes of the first $i$ levels of
the branching process. Then the probability that the first answer in our branching process
is at level $i \ge 1$ is $\lambda_i = \phi_{i-1} - \phi_i$, with $\phi_0 = 1$.

Some crucial properties of the sequence $\{\lambda_i\}$ are summarized in Lemma 3.1. One
essential property is that $\{\lambda_i\}$ is single-peaked: it first increases approximately geometrically
with a constant ratio. Then it stays at nearly its maximum value for a constant number of levels
until it starts to decrease geometrically. This property might be of independent interest.

Lemma 3.1. Consider a branching process with branching factor $b > 1$. There exist
levels $1 \le \ell^{\dagger} \le \ell^*$ such that

(1) $\{\lambda_i\}$ is a single-peaked sequence, peaking at level $\ell^*$, i.e., $\forall i \le \ell^*$, $\lambda_{i-1} \le \lambda_i$ and $\forall j \ge \ell^*$,
$\lambda_j \ge \lambda_{j+1}$.

(2) There exists a constant $\rho > 1$ such that $\forall i < \ell^{\dagger}$, $\lambda_{i+1} \ge \rho \cdot \lambda_i$.

(3) $\lambda_{\ell^{\dagger}} = \Omega(1)$ and consequently $\ell^* - \ell^{\dagger} = O(1)$.

(4) There exists a constant $\gamma > 1$ such that $\forall i \ge \ell^* + 2$, $\sum_{j \ge i} \lambda_j \le \gamma \cdot \lambda_i$.

Following the analysis in the literature, we fix the branching factor $b$ as a constant but
we allow $n$ to grow.
Define the function

$$t(x) = \sum_{j} c_j \, x^j \left(1 - \frac{1}{n}\right)^j.$$

Notice that $\phi_i = t(\phi_{i-1}) = t(t(\phi_{i-2})) = t^{(i)}(1)$. This is because, if the root has $j$ children,
then the event that there is no answer at the first $i$ levels (with probability $\phi_i$) is the same
as the event that none of its $j$ children has the answer (with probability $(1 - \frac{1}{n})^j$) and none
of the $(i-1)$-level subtrees rooted at its children has the answer (with probability $\phi_{i-1}^j$). The
following result studies the growth rate of $\lambda_i$.

Proposition 3.2. For all $i \ge 1$,

$$\frac{\lambda_{i+1}}{\lambda_i} \in \left[\, t'(\phi_i),\; t'(\phi_{i-1}) \,\right]. \tag{3}$$

Proof. Notice that $\phi_i = t(\phi_{i-1})$. Therefore, $\lambda_{i+1} = \phi_i - \phi_{i+1} = t(\phi_{i-1}) - t(\phi_i)$.
As $t(\cdot)$ is continuous, by the mean value theorem, there exists $y \in [\phi_i, \phi_{i-1}]$ such that $\lambda_{i+1} =
t'(y)(\phi_{i-1} - \phi_i) = t'(y) \cdot \lambda_i$. Finally, the result follows from the fact that $t'(x)$ is a monotonically
increasing function for $x \ge 0$. □
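The recursion $\phi_i = t(\phi_{i-1})$ and the ratio bound of Proposition 3.2 can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names and the example offspring distribution are hypothetical, and `offspring[j]` is assumed to play the role of the coefficient $c_j$ (the probability that a node has $j$ children).

```python
from typing import Dict, List


def no_answer_probs(offspring: Dict[int, float], n: float, levels: int) -> List[float]:
    """Return [phi_0, ..., phi_levels] with phi_0 = 1 and phi_i = t(phi_{i-1})."""
    q = 1.0 - 1.0 / n  # probability that a single node has no answer

    def t(x: float) -> float:
        # t(x) = sum_j c_j * x^j * (1 - 1/n)^j
        return sum(c * (q * x) ** j for j, c in offspring.items())

    phi = [1.0]
    for _ in range(levels):
        phi.append(t(phi[-1]))
    return phi


def t_prime(offspring: Dict[int, float], n: float, x: float) -> float:
    """Derivative t'(x) = sum_j c_j * j * (1 - 1/n)^j * x^(j-1)."""
    q = 1.0 - 1.0 / n
    return sum(c * j * q ** j * x ** (j - 1) for j, c in offspring.items() if j >= 1)


def first_answer_probs(phi: List[float]) -> List[float]:
    """Return lam with lam[i-1] = lambda_i = phi_{i-1} - phi_i."""
    return [phi[i - 1] - phi[i] for i in range(1, len(phi))]
```

For instance, with the hypothetical distribution `{1: 0.5, 2: 0.5}` (branching factor 1.5) and `n = 100`, one can compare each ratio `lam[i] / lam[i-1]` against `t_prime(..., phi[i])` and `t_prime(..., phi[i-1])` to see the interval in Eqn. (3).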

We will also utilize the following two results from previous works. 

Lemma 3.3 ([Arcaute et al. 2007]). Given a constant $\epsilon > 0$, $\forall x \in [1 - \epsilon, 1]$, we have
$t'(x) \in \left[(1 - \frac{1}{n}) \cdot b \cdot (1 - 5\epsilon d),\; (1 - \frac{1}{n}) \cdot b\right]$.

Lemma 3.4 ([Cebrian et al. 2012]). Consider any GW branching process with
branching factor $b > 1$. Then, for every $i$ such that $\zeta < \phi_i < 1$, it holds that

— <max{- -,- -•- -rjjj-r}, 4) 

where $\Psi(x)$ is the generating function of the branching process and $\zeta$ is the extinction
probability.

Now we are ready to prove the main result in this section. 

Proof of Lemma 3.1. When $b > 1$, notice that $\{\phi_i\}$ is a strictly decreasing sequence
while $t'(x)$ is increasing for $x \ge 0$. Proposition 3.2 indicates that the ratio $\lambda_{i+1}/\lambda_i$ is
decreasing, which implies that $\{\lambda_i\}$ is a single-peaked sequence. This proves (1) of Lemma 3.1.

Now we proceed to the second property. Let $\epsilon \in (0, 1)$ be some constant we will specify
later. Define $\ell(\epsilon) = \max\{i \mid \phi_i \ge 1 - \epsilon\}$.

By Proposition 3.2, we have $\lambda_{i+1}/\lambda_i \ge t'(\phi_i)$. Let

$$\rho = \left(1 - \frac{1}{n}\right) \cdot b \cdot (1 - 5\epsilon d) > 1, \tag{5}$$

which holds for sufficiently small $\epsilon$. By Lemma 3.3 and the definition of $\ell(\epsilon)$, we have for
any $1 \le i \le \ell(\epsilon)$, $\lambda_{i+1}/\lambda_i \ge t'(\phi_i) \ge \rho$. Therefore, property (2) holds by setting $\ell^{\dagger} = \ell(\epsilon) - 1$.
For the third property, we show that $\lambda_{\ell(\epsilon)+1} = \Omega(1)$ for some carefully chosen $\epsilon$. Since
$\lambda_{i+1}/\lambda_i \le t'(1) \le b$, $\lambda_{\ell^{\dagger}} \ge \lambda_{\ell(\epsilon)+1}/b^2$. Let $\zeta$ be the extinction probability of the branching
process. Assume

$$0 < \epsilon < 1 - \zeta. \tag{6}$$

By definition of £(e), we have (l>i(^f) > 1 — e and 0f(e)+i < 1 — e. Define c(e) = max{^^, ^37—- 
i-qi'(c) }- Notice that c(e) is non-decreasing for e G (0, 1 — C)- By Lemma \JM we have 

^^P^<c(l-<^,(,))<c(e). (7) 

Since $\lambda_{\ell(\epsilon)+1} = \phi_{\ell(\epsilon)} - \phi_{\ell(\epsilon)+1}$, we have

. 1 - 0f(e) _ 1 ~ 4'e{e} + l — Xi(e) + 1 ^ - ^^(e)-|-l 

c(e) c(e) c(e) 

Therefore, Af(g)+i > ^.^^ , -^ = ri(l), as botli e and c(e) are constants independent of n. In 

other words, for any constant $\epsilon$ that satisfies Eqn. (5) and Eqn. (6), we can set $\ell^{\dagger} = \ell(\epsilon) - 1$
and both properties (2) and (3) hold. Since $\lambda_i$ is growing from $\ell^{\dagger}$ to $\ell^*$, we must have
$\ell^* - \ell^{\dagger} = O(1)$. (The total probability is at most 1.)

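Finally, we consider property (4) regarding the sequence $\{\lambda_i\}$ after $\ell^*$. By the definition
of $\ell^*$ and Proposition 3.2,

$$t'(\phi_{\ell^*+1}) \le \frac{\lambda_{\ell^*+2}}{\lambda_{\ell^*+1}} \le t'(\phi_{\ell^*}) \le \frac{\lambda_{\ell^*+1}}{\lambda_{\ell^*}} \le 1. \tag{8}$$

If we could assume that $t'(\phi_{\ell^*})$ is a constant less than 1, we would be done. However, it is not clear
whether $t'(\phi_{\ell^*})$ is bounded away from 1 or not. (For example, if $t'(\phi_{\ell^*}) = 1 - 1/n$, we
cannot directly conclude property (4).) We consider two cases:

(1) $\lambda_{\ell^*+2}/\lambda_{\ell^*+1} \le 1/2$. This case is trivial, since for $i \ge \ell^* + 1$, $\lambda_{i+1} \le \lambda_i/2$ by the monotonicity
of $t'(\cdot)$ and Proposition 3.2. Therefore, property (4) holds with $\gamma = 2$.

(2) $\lambda_{\ell^*+2}/\lambda_{\ell^*+1} > 1/2$. This case implies $\lambda_{\ell^*+2} > \lambda_{\ell^*+1}/2 > \lambda_{\ell^*}/4 = \Omega(1)$. Therefore, $\phi_{\ell^*+1} \ge
\lambda_{\ell^*+2} = \Omega(1)$.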

Notice that for any constant $x \in (0, 1)$, $t''(x) \ge x^{d} \sum_{i \ge 2} c_i = \Omega(1)$. ($b > 1$ implies
$\sum_{i \ge 2} c_i > 0$.) Hence, in case (2), we have $t''(\phi_{\ell^*+1}) = \Omega(1)$.

Because $t'(\cdot)$ is continuous, by the mean value theorem, there exists $y \in [\phi_{\ell^*+1}, \phi_{\ell^*}]$ such
that

$$t'(\phi_{\ell^*}) - t'(\phi_{\ell^*+1}) = t''(y) \cdot (\phi_{\ell^*} - \phi_{\ell^*+1}) \ge t''(\phi_{\ell^*+1}) \cdot \lambda_{\ell^*+1} = \Omega(1).$$

Now since $t'(\phi_{\ell^*}) \le 1$ by Eqn. (8), we can conclude that $\gamma' = t'(\phi_{\ell^*+1})$ is bounded away from
1, i.e., $\gamma'$ is a constant smaller than 1. Then for any $i \ge \ell^* + 2$, we have $\lambda_{i+1} \le \gamma' \cdot \lambda_i$. By
setting $\gamma = \frac{1}{1 - \gamma'}$, we prove property (4). □

4. OPTIMAL SYBIL-PROOF DR MECHANISM ON CHAINS 

In this section, we discuss the case that the underlying tree is simply an infinite chain. Each 
node in the chain has an answer with an independent probability 1/n. We design a direct 
referral mechanism which is sybil-proof in this case. We also show that, in fact, the DR 
mechanism is optimal, up to some mild assumptions. 
4.1. The direct referral reward scheme

Let $h$ be the level of the chain that we want to propagate the query to. Notice that any
oblivious reward scheme can be written as a function $r(i, s)$ for $i \ge 1$, $s \ge 0$ and $i + s \le h$,
where $r(i, s)$ is the reward to the $i$-th agent on an answer path with length $i + s \le h$, i.e.,
the first answer appears at level $i + s$. Since we discuss DR mechanisms, we have $r(i, s) = 1$
for any $s \notin \{0, 1\}$ and $i + s \le h$.

Let $p = 1/n$ be the probability that one agent has an answer. Define $R_i =
\sum_{s=1}^{h-i} r(i, s)\, p (1 - p)^{s-1}$, i.e., $R_i$ is the expected reward of the $i$-th agent conditioned on
the event that the first $i$ agents do not have any answer. Let $P_i = \sum_{j=1}^{i} p (1 - p)^{j-1}$ be the
probability that there is an answer in $i$ consecutive nodes. We consider the following reward
allocation scheme in the direct referral mechanism.

Definition 4.1. DR reward allocation scheme on chains. Define r{i,s) as: 

if i < /i- 1 As = 1, 

otherwise. 
Notice that $R_i = p \cdot r(i, 1) + (1 - p) P_{h-i-1}$. Therefore, by the definition of $r(i, 1)$, we have

$$R_i = R_{i+1} + P_{h-i-1}. \tag{10}$$

In what follows, we show several properties of the direct referral mechanism: (a) it is
sybil-proof; (b) the expected cost is $O(h^2 n)$; and (c) it is the optimal sybil-proof mechanism,
i.e., with the optimal cost.

To show the sybil-proofness, we only need to prove that the following strategy profile is a 
Nash equilibrium: for each node v with distance less than h from the root, if v does not have 
an answer, the strategy of v is to propagate the query and does not create sybils; if it has 
an answer, the strategy of v is to report the answer without creating sybils. For the node 
at distance h, the node will return the answer if it has one; otherwise, it will not propagate 
further and will not create sybils. 

Lemma 4.2. The DR mechanism with above defined reward allocation scheme is sybil- 
proof on chains. 

Proof. For the node v with distance smaller than h from the root, if it does not hold 
an answer, the expected reward for propagating the query is always larger than 0. Thus, it 
will choose to propagate. If v has an answer, v does not need to propagate the query, since 
it can fake any possible path from itself to a node. Furthermore, since the DR scheme only 
rewards nodes of the first h levels, the node at depth h from the root will not propagate 
the query. 

Based on the above discussion, for the sybil-proofness, we only need to rule out the
strategy in which a node creates sybils. For the node at distance $h$ from the root, it is not
beneficial to generate fake nodes as the scheme only rewards the first $h$ nodes in the chain.
So it is sufficient to consider the nodes with distance less than $h$ to the root.

Consider an agent $v$ with distance $i < h$ to the root. We first assume that $v$ does not have
an answer. Denote by $D(i, k)$ the expected reward of $v$ if $v$ duplicates $k \le h - i$ additional
sybils, conditioned on the event that no node from the root to $v$ has an answer. Then we
have

$$\begin{aligned} D(i, k) = R_{i+k} + k \cdot P_{h-i-k} &= R_{i+k-1} - P_{h-i-k} + k \cdot P_{h-i-k} \\ &\le R_{i+k-1} + (k - 1) \cdot P_{h-i-k+1} = D(i, k - 1). \end{aligned}$$
The second equality is by Eqn. (10). One can then inductively show $D(i, k) \le D(i, 0)$. In
other words, node $v$ will not benefit by generating sybils if it does not have an answer.

Next we consider the case that $v$ has an answer and all previous nodes do not, i.e., $v$ is
the first answer holder. Assume that $v$ generates $k$ sybils, where $k \le h - i$. Since the DR
scheme only rewards the first answer, $v$ will place the answer at the last sybil. (Otherwise,
it may just generate fewer sybils.) Then the reward $v$ gets in this case is

$$\begin{aligned} \sum_{s=0}^{k-1} r(i + s, k - s) + r(i + k, 0) &\le \sum_{s=0}^{k-1} r(i + s, 1) + r(i + k, 0) \\ &= \sum_{s=0}^{k-1} r(i + s, 1) + \sum_{s=i+k}^{h-1} r(s, 1) + 1 = \sum_{t=i}^{h-1} r(t, 1) + 1 = r(i, 0). \end{aligned}$$

The inequality comes from the fact that $r(i, 1) \ge 1$. By the above inequality, $v$ gets the
highest reward without duplication. Combining all of the above, we have shown that the
DR mechanism is sybil-proof. □

4.2. Efficiency of the DR mechanism 

Next we upper bound the expected total reward of the DR mechanism on chains. We first
bound the values of $r(i, 1)$ in the following proposition.

Proposition 4.3. For $i < h - 1$, $r(i, 1) \le nhP_h + 1$, and $r(i, 0) \le nh^2 P_h + h$.

Proof. By Eqn. (10), $R_i \le R_{i+1} + P_h$, which implies $R_i \le hP_h$. By definition, $r(i, 1) =
nR_{i+1} + P_{h-i-1} \le nhP_h + 1$. The result on $r(i, 0)$ directly follows. □
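The chain quantities above can be tabulated by a backward recursion. The following is a minimal sketch under stated assumptions, not the authors' implementation: it takes $r(i,1) = nR_{i+1} + P_{h-i-1}$ for $i < h-1$ (the identity used in the proof of Proposition 4.3), assumes $r(h-1,1) = 1$ and $r(i,s) = 1$ for $s \ge 2$, and uses $r(i,0) = 1 + \sum_{t=i}^{h-1} r(t,1)$ (the identity at the end of the proof of Lemma 4.2). The function names `chain_dr` and `sybil_payoff` are hypothetical.

```python
from typing import List, Tuple


def chain_dr(n: int, h: int) -> Tuple[List[float], List[float], List[float], List[float]]:
    """Return (P, R, r1, r0) for a chain with answer rarity n and height h.

    P[i]  = probability of an answer among i consecutive nodes
    R[i]  = expected referral reward of agent i (R[h] = 0)
    r1[i] = r(i, 1), the direct referral reward (assumed form, see lead-in)
    r0[i] = r(i, 0), the answer holder reward
    """
    p = 1.0 / n
    P = [1.0 - (1.0 - p) ** i for i in range(h + 1)]
    R = [0.0] * (h + 2)
    r1 = [0.0] * (h + 1)
    for i in range(h - 1, 0, -1):
        # Assumption: the boundary agent h-1 gets the normalized reward 1.
        r1[i] = 1.0 if i == h - 1 else n * R[i + 1] + P[h - i - 1]
        # R_i = p * r(i,1) + (1 - p) * P_{h-i-1}
        R[i] = p * r1[i] + (1.0 - p) * P[h - i - 1]
    r0 = [0.0] * (h + 2)
    r0[h] = 1.0
    for i in range(h - 1, 0, -1):
        r0[i] = r0[i + 1] + r1[i]
    return P, R, r1, r0


def sybil_payoff(P: List[float], R: List[float], h: int, i: int, k: int) -> float:
    """D(i, k) = R_{i+k} + k * P_{h-i-k}: payoff of agent i using k sybils."""
    return R[i + k] + k * P[h - i - k]
```

Under these assumptions, `sybil_payoff(P, R, h, i, k)` should be non-increasing in `k`, which is the numerical counterpart of $D(i,k) \le D(i,k-1)$, and `r1`, `r0` can be compared against the bounds of Proposition 4.3.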

Lemma 4.4. The expected cost of the DR scheme is $\Theta(P_h^2 n h^2)$.

Proof. We first consider the upper bound. Clearly, the cost is dominated by the $r(i, 0)$s,
which is at most $P_h \cdot \max_i \{r(i, 0)\} = O(P_h^2 n h^2)$.

Now consider the lower bound. By Eqn. (10), for $i \le h/4$, $R_i = \sum_{j=1}^{h-i-1} P_j \ge \frac{h}{4} \cdot P_{h/2}$.
By the definition of $r(\cdot)$, for $i \le h/4 - 1$, $r(i, 1) \ge nR_{i+1} \ge \frac{nh}{4} \cdot P_{h/2}$. Then for all $i \le h/8$,
$r(i, 0) \ge \sum_{j=i}^{h/4-1} r(j, 1) \ge \frac{h}{8} \cdot \frac{nh}{4} P_{h/2}$. Therefore, the expected cost is at least

$$P_{h/8} \cdot \min_{i \le h/8} \{r(i, 0)\} \ge \frac{h^2 n}{32} P_{h/8} \cdot P_{h/2}.$$

The lower bound is attained since $P_{h/2} \ge P_h/2$ and $P_{h/8} \ge P_h/8$. □

We have the main result of this section. 

Theorem 1.1 (restated). If the underlying branching process is a deterministic chain,
there exists a sybil-proof direct referral incentive mechanism with expected cost $O(nh^2)$,
where $h$ is the desired level of agents the root wants to propagate the query to and $n$ is the
answer rarity.

4.3. Optimality of the DR mechanism on chains 

In fact, our DR mechanism is optimal with respect to a general class of query incentive 
mechanisms on chains. To proceed with our discussion, we focus on the query incentive 
mechanisms with following properties: 

(1) The answer selection is deterministic on the chain. (Notice that both RW and SP are 
deterministic and identical on a chain.) 

(2) The mechanism is sybil-proof. 

(3) The mechanism must retrieve an answer if there is one in the first h agents. 



(4) The reward allocation scheme is normalized, i.e., the reward to each node from the root 
to the first answer must be at least 1. 

We call a query incentive mechanism regular if all four constraints are satisfied. Notice that
our DR mechanism is regular. In fact, our DR mechanism has the smallest cost in all
resulting configurations among all regular query incentive mechanisms. We provide a proof
in Appendix A.

Theorem 4.5. The DR mechanism on chains is an optimal regular query incentive 
mechanism. 

5. SYBIL-PROOF DR MECHANISMS ON ARBITRARY BRANCHING PROCESSES 

In this section, we present a sybil-proof query incentive mechanism for an arbitrary
branching process with branching factor $b > 1$. In particular, we present a Direct Referral (DR)
mechanism for any height $h$ of interest. We will show that the DR mechanism, which uses
the SP answer selection scheme, is sybil-proof. After that, we show that the expected cost
of the DR scheme is $O(h^2)$. It is worth pointing out that, compared with previous works,
our analysis does not depend on a particular height $h$, and it does not require $b > 2$.

5.1. Direct referral query incentive mechanisms with height h 

The DR mechanism uses the SP answer selection scheme. Similar to the chain case,
the reward allocation scheme in the DR mechanism is oblivious. In particular, the reward
allocation scheme can be described by a function $r(\cdot, \cdot)$, where $r(i, s)$ is the reward of the
$i$-th agent in the selected answer path when the selected answer holder is the $(i + s)$-th agent,
for $i \ge 1$ and $s \ge 0$.

To simplify the notation, we decompose the referral reward (with $s \ge 1$) $r(i, s)$ into two
parts: $r(i, s) = x(i, s) + y(i, s)$. Here $x(i, s) \ge 0$ is for the direct referral and $y(i, s) \ge 0$ ensures
that the reward is at least 1. Specifically, the referral rewards in our DR mechanism for
height $h$ with $i \le h - 1$ and $s \ge 1$ are defined as $r(i, s) = x(i, s) + y(i, s)$:

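$$x(i, s) = \begin{cases} \max\limits_{i < j \le h} \left\{ x(j, 1) + \dfrac{j - i}{\lambda_{i+1}} \sum\limits_{\ell=i+1}^{h} \lambda_\ell \right\}, & \text{if } i \le h - 1 \text{ and } s = 1, \\ 0, & \text{otherwise,} \end{cases} \tag{11}$$

and

$$y(i, s) = \begin{cases} 1, & \text{if } i + s \le h \text{ and } s \ge 1, \\ 0, & \text{otherwise.} \end{cases} \tag{12}$$

Finally, the reward of the selected answer holder at level $i$ is defined as $a_i = r(i, 0)$ for $i \le h$
as follows: $a_h = 1$ and for $i < h$,

$$a_i = x(i, 1) + a_{i+1} + 1 \;\Rightarrow\; a_i = (h - i + 1) + \sum_{i \le j < h} x(j, 1). \tag{13}$$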

Informally, in the above reward scheme, if the selected answer is at level $i \le h$, the answer
holder receives $a_i$, its direct referral at level $i - 1$ receives $x_{i-1} + 1$ and all other agents from
the root to the selected answer holder receive 1. For simplicity, we abbreviate the sequence
$\{x(i, 1)\}$ as $\{x_i\}$.

Remark: Our mechanism requires a specified height h. Furthermore, to compute the 
rewards, we need the knowledge of the branching process. Notice that, such knowledge is 
also required in previous literature to compute the initial reward as well as the contracts 
between the agents. 
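Given the first-answer probabilities, the rewards of this section can be computed by a backward pass over the levels. The following is a minimal sketch, not the authors' implementation. It assumes the reading $x_i = \max_{i<j\le h}\{x_j + \frac{j-i}{\lambda_{i+1}} \sum_{\ell=i+1}^{h} \lambda_\ell\}$ with $x_h = 0$ for Eqn. (11), and $a_h = 1$, $a_i = x_i + a_{i+1} + 1$ for Eqn. (13). The function name `dr_tree_rewards` and the input list are hypothetical.

```python
from typing import List, Tuple


def dr_tree_rewards(lam: List[float]) -> Tuple[List[float], List[float]]:
    """Compute (x, a) from lam, where lam[i] = lambda_i for i = 1..h.

    lam[0] is unused. Every lam[i] with i >= 2 must be positive.
    x[i] is the direct referral reward x_i, a[i] the answer holder reward a_i.
    """
    h = len(lam) - 1
    # tail[i] = lambda_i + lambda_{i+1} + ... + lambda_h
    tail = [0.0] * (h + 2)
    for i in range(h, 0, -1):
        tail[i] = tail[i + 1] + lam[i]
    x = [0.0] * (h + 1)  # x[h] = 0 (assumed boundary value)
    for i in range(h - 1, 0, -1):
        x[i] = max(
            x[j] + (j - i) * tail[i + 1] / lam[i + 1]
            for j in range(i + 1, h + 1)
        )
    a = [0.0] * (h + 2)
    a[h] = 1.0
    for i in range(h - 1, 0, -1):
        a[i] = x[i] + a[i + 1] + 1.0
    return x, a
```

By construction, each `x[i]` dominates `x[i + j] + j * tail[i + 1] / lam[i + 1]` for every admissible `j`, which is the inequality the sybil-proofness argument relies on.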
5.2. Sybil-proofness of the DR query incentive mechanism

We show that the DR mechanism defined above is sybil-proof. Eqn. (13) implies that if one
agent at level $i$ has an answer, it does not have an incentive to generate sybils, i.e., $a_i$ is its
largest possible reward and generating sybils will not increase its chance to be selected. So
to show the sybil-proofness of the DR mechanism, we only need to argue for those agents
who do not have answers. The following proposition characterizes the probabilities that one
agent is rewarded when it does not have an answer.

Proposition 5.1. For an agent $v$ at level $i < h$ in the underlying $d$-ary tree of the
branching process, let $\mathrm{Rev}(v)$ be the event that $v$ is on the path from the root to the selected
answer holder $u \ne v$ and $\mathrm{DR}(v)$ be the event that $u$ is a child of $v$. We have

$$\Pr[\mathrm{Rev}(v)] = \frac{\sum_{j=i+1}^{h} \lambda_j}{d^i} \quad \text{and} \quad \Pr[\mathrm{DR}(v)] = \frac{\lambda_{i+1}}{d^i}.$$

Let $\mathrm{NA}(v)$ be the event that $v$ does not have an answer. Then

$$\Pr[\mathrm{Rev}(v) \mid \mathrm{NA}(v)] = \frac{n}{n-1} \cdot \frac{\sum_{j=i+1}^{h} \lambda_j}{d^i} \quad \text{and} \quad \Pr[\mathrm{DR}(v) \mid \mathrm{NA}(v)] = \frac{n}{n-1} \cdot \frac{\lambda_{i+1}}{d^i}.$$

Proof. $\mathrm{Rev}(v)$ happens if the selected answer $u$ is at level $i + 1$ or lower, which happens with
probability $\sum_{j=i+1}^{h} \lambda_j$, as the mechanism only retrieves an answer in the first $h$ levels. Notice
that there are exactly $d^i$ agents at level $i$. (The root is treated as level 0.) By symmetry, the
probability that $u$ is in the subtree of $v$ is exactly $\frac{1}{d^i}$. Therefore, $\Pr[\mathrm{Rev}(v)] = \sum_{j=i+1}^{h} \lambda_j / d^i$.
By the same argument, we have $\Pr[\mathrm{DR}(v)] = \lambda_{i+1}/d^i$.

Now consider the two probabilities conditioned on the event $\mathrm{NA}(v)$ with $\Pr[\mathrm{NA}(v)] =
1 - 1/n$. Notice that $\mathrm{Rev}(v)$ (resp. $\mathrm{DR}(v)$) implies $\mathrm{NA}(v)$. In other words, $\Pr[\mathrm{Rev}(v) \wedge
\mathrm{NA}(v)] = \Pr[\mathrm{Rev}(v)]$ (resp. $\Pr[\mathrm{DR}(v) \wedge \mathrm{NA}(v)] = \Pr[\mathrm{DR}(v)]$). The results then follow from
Bayes' theorem. □
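For concreteness, the conditioning step for the first event can be written out as follows; the second event is handled identically. This only expands the argument stated in the proof and introduces no new symbols.

```latex
\Pr[\mathrm{Rev}(v) \mid \mathrm{NA}(v)]
  = \frac{\Pr[\mathrm{Rev}(v) \wedge \mathrm{NA}(v)]}{\Pr[\mathrm{NA}(v)]}
  = \frac{\Pr[\mathrm{Rev}(v)]}{1 - 1/n}
  = \frac{n}{n-1} \cdot \frac{\sum_{j=i+1}^{h} \lambda_j}{d^i}.
```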

Lemma 5.2. The above DR mechanism is sybil-proof.

Proof. As discussed earlier, it is sufficient to consider agents that do not have answers.
Consider such an agent $v$ at level $i \le h - 1$. Suppose, for contradiction, that $v$ gets more reward by generating
$j$ sybils and attaching its original subtree to the last sybil.

Consider the case that $v$ does not create sybils. Conditioning on the event $\mathrm{NA}(v)$, i.e.,
$v$ does not have an answer, by Proposition 5.1 the probability that $v$ receives the direct
referral fee is $\frac{n}{n-1} \cdot \frac{\lambda_{i+1}}{d^i}$ and the probability that the answer selected is in the subtree rooted
at $v$ is $\sum_{\ell=i+1}^{h} \frac{n}{n-1} \cdot \frac{\lambda_\ell}{d^i}$.

Notice that if $v$ generates sybils, neither probability increases. This can be shown by
coupling the randomness in both cases. For any fixed realization of the branching process
and the answer placement, the probability that a particular answer in $v$'s subtree is selected
is always higher (or equal) in the case that $v$ does not generate sybils, by the SP answer
selection scheme.

Now we consider the rewards in both cases. If $v$ does not generate sybils, the expected
reward is

$$R^0 = \frac{n}{n-1} \cdot \left( \frac{\lambda_{i+1}}{d^i} \cdot x_i + \frac{\sum_{\ell=i+1}^{h} \lambda_\ell}{d^i} \right). \tag{14}$$

The first part is the reward from direct referral $x_i$ and the second part comes from $y(i, s)$.
For the case that $v$ generates $j \ge 1$ sybils, the expected reward of $v$ and its sybils is at most

$$R^j \le \frac{n}{n-1} \cdot \left( \frac{\lambda_{i+1}}{d^i} \cdot x_{i+j} + (j + 1) \cdot \frac{\sum_{\ell=i+1}^{h} \lambda_\ell}{d^i} \right). \tag{15}$$

By Eqn. (11), $x_i \ge x_{i+j} + j \cdot \frac{\sum_{\ell=i+1}^{h} \lambda_\ell}{\lambda_{i+1}}$, which implies $R^0 \ge R^j$. This is a contradiction.
Therefore, agent $v$ will not benefit by generating sybils if it does not have the answer. Hence
the DR mechanism is sybil-proof. □

5.3. The expected cost of the DR mechanism 

Now we start analyzing the expected cost of the DR mechanism, which can be described as
follows:

$$\sum_{i=1}^{h} \lambda_i \cdot a_i + \sum_{i=2}^{h} \lambda_i \cdot x_{i-1} + \sum_{i=1}^{h} \lambda_i \cdot (i - 1). \tag{16}$$

The first term is the reward to the answer holder. The second term is the reward for the
direct referral and the third term is for all other agents that forwarded the answer. Before
we analyze the cost in Eqn. (16), we characterize the sequence $\{x_i\}$.

Observation 5.1. The sequence $\{x_i\}$ is decreasing.

Proposition 5.3. For $i \ge \ell^* + 1$, $x_i \le \gamma \cdot (h - i)$, where $\gamma$ is the constant in Lemma 3.1.

Proof. We prove the statement by induction. The statement holds for $i = h$ as $x_h = 0$.
Suppose it holds for all $i \ge j$. Now consider $j - 1 \ge \ell^* + 1$. By construction,

$$x_{j-1} = \max_{\ell \ge j} \left\{ x_\ell + \frac{\ell - j + 1}{\lambda_j} \sum_{s=j}^{h} \lambda_s \right\} \le \max_{\ell \ge j} \left\{ \gamma \cdot (h - \ell) + \gamma \cdot (\ell - j + 1) \right\} = \gamma \cdot (h - j + 1).$$

The inequality is by Lemma 3.1 (4). □

In other words, we will not pay a large referral fee to agents beyond level $\ell^*$.

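Proposition 5.4. For all $i \le \ell^*$, we have $\lambda_{i+1} \cdot x_i \le (\gamma + 1) \cdot (h - i)$.

Proof. The statement is proved by induction. We first consider the case $i = \ell^*$. By the
definition in Eqn. (11),

$$\begin{aligned} x_{\ell^*} &= \max_{j > \ell^*} \left\{ x(j) + \frac{j - \ell^*}{\lambda_{\ell^*+1}} \sum_{s=\ell^*+1}^{h} \lambda_s \right\} = \max_{j > \ell^*} \left\{ x(j) + (j - \ell^*) \cdot \left( 1 + \frac{\sum_{s=\ell^*+2}^{h} \lambda_s}{\lambda_{\ell^*+1}} \right) \right\} \\ &\le \max_{j > \ell^*} \left\{ \gamma \cdot (h - j) + (j - \ell^*) \cdot (1 + \gamma) \right\} \le (\gamma + 1) \cdot (h - \ell^*). \end{aligned} \tag{17}$$

The inequalities follow from Lemma 3.1 (4), Lemma 3.1 (1) and Proposition 5.3.

Now suppose it holds for all $i$ with $j \le i \le \ell^*$. Consider $j - 1 \ge 1$; we have

$$\lambda_j \cdot x_{j-1} = \max_{\ell \ge j} \left\{ \lambda_j \cdot x_\ell + (\ell - j + 1) \sum_{s=j}^{h} \lambda_s \right\} \le \max_{\ell \ge j} \left\{ \lambda_j \cdot x_\ell + (\ell - j + 1) \right\}. \tag{18}$$

Assume the term $\max_{\ell \ge j} \{\lambda_j x_\ell + (\ell - j + 1)\}$ is optimized at $\ell = j^*$. Consider two cases:

Case 1: $j^* > \ell^*$. In this case, $\lambda_j \cdot x_{j^*} + (j^* - j + 1) \le (\gamma + 1) \cdot (h - j^*) + (j^* - j + 1) \le
(\gamma + 1) \cdot (h - j + 1)$ by Proposition 5.3 and Eqn. (11).

Case 2: $j \le j^* \le \ell^*$. We have

$$\begin{aligned} \lambda_j \cdot x_{j^*} + (j^* - j + 1) &\le \lambda_{j^*+1} x_{j^*} + (j^* - j + 1) \le (\gamma + 1) \cdot (h - j^*) + (j^* - j + 1) \\ &\le (\gamma + 1) \cdot (h - j + 1). \end{aligned}$$

The first inequality comes from $\lambda_j \le \lambda_{j^*+1}$ by Lemma 3.1 (1). The second inequality is by
induction. Therefore, in both cases, we have $\lambda_j \cdot x_{j-1} \le (\gamma + 1)(h - j + 1)$. □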

We are ready to show the expected cost of the above mechanism. 

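Lemma 5.5. The expected reward of the DR query incentive mechanism is $O(h^2)$.

Proof. By Proposition 5.4 and Proposition 5.3, the total referral fee is

$$\sum_{i=2}^{h} \lambda_i \cdot x_{i-1} \le \sum_{i=2}^{\ell^*+1} (\gamma + 1)(h - i + 1) + \gamma \cdot (h - \ell^* - 1) \cdot \sum_{i > \ell^*+1} \lambda_i = O(h^2). \tag{19}$$

Now we analyze the total expected reward for answer holders.

$$\begin{aligned} \sum_{i=1}^{h} \lambda_i \cdot a_i &= \sum_{i=1}^{h} \lambda_i \cdot \Big( h - i + 1 + \sum_{j=i}^{h} x_j \Big) \le h + \sum_{j=1}^{h} x_j \sum_{i=1}^{j} \lambda_i \\ &\le h + \sum_{j=1}^{\ell^{\dagger}-1} x_j \sum_{i=1}^{j} \lambda_i + \sum_{j=\ell^{\dagger}}^{\ell^*} x_j + \sum_{j=\ell^*+1}^{h} x_j, \end{aligned} \tag{20}$$

where $\ell^{\dagger}$ is defined in Lemma 3.1. We inspect the terms individually. Consider the second
term in Eqn. (20). By Lemma 3.1 (2), $\{\lambda_i\}$ grows at a rate of $\rho > 1$ until $\lambda_{\ell^{\dagger}}$ for $\ell^{\dagger} \le \ell^*$.
Therefore, for any $j \le \ell^{\dagger}$, $\sum_{i=1}^{j} \lambda_i \le \frac{\rho}{\rho - 1} \lambda_j$. Then

$$\sum_{j=1}^{\ell^{\dagger}-1} x_j \sum_{i=1}^{j} \lambda_i \le \frac{\rho}{\rho - 1} \sum_{j=1}^{\ell^{\dagger}-1} \lambda_j x_j \le \frac{\rho}{\rho - 1} \sum_{j=1}^{\ell^{\dagger}-1} \lambda_{j+1} x_j = O(h^2).$$

The second inequality is by the fact $\lambda_i \le \lambda_{i+1}$ for $i \le \ell^* - 1$ in Lemma 3.1 (1) and the last
equality comes from Eqn. (19).

For the third term, by Proposition 5.4, for any $\ell^{\dagger} \le j \le \ell^*$,

$$x_j \le (\gamma + 1)(h - j) \cdot \frac{1}{\lambda_{j+1}} \le (\gamma + 1)(h - j) \cdot \frac{1}{\lambda_{\ell^{\dagger}}} = O(h).$$

The last equality comes from $\lambda_{\ell^*} \ge \lambda_j \ge \lambda_{\ell^{\dagger}} = \Omega(1)$ by Lemma 3.1 (3). Also by
Lemma 3.1 (3), $\ell^* - \ell^{\dagger} = O(1)$ and the third term in Eqn. (20) is $O(h)$.

Finally, by Proposition 5.3, the last term in Eqn. (20) is $O(h^2)$. We conclude that

$$\sum_{i=1}^{h} \lambda_i \cdot a_i = O(h^2). \tag{21}$$

Combining Eqn. (19) and Eqn. (21), the expected cost of Eqn. (16) is $O(h^2)$. □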
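The three terms of Eqn. (16) can be evaluated directly for a concrete branching process, which gives a quick way to inspect how the cost behaves as $h$ grows. The following is a minimal sketch, not the authors' implementation. It assumes the reading $x_i = \max_{i<j\le h}\{x_j + \frac{j-i}{\lambda_{i+1}} \sum_{\ell=i+1}^{h} \lambda_\ell\}$ with $x_h = 0$ for Eqn. (11), and treats `offspring[j]` as the probability of having $j$ children. The function names are hypothetical.

```python
from typing import Dict, List


def first_answer_probs(offspring: Dict[int, float], n: float, h: int) -> List[float]:
    """Return lam with lam[i] = lambda_i for i = 1..h (lam[0] is unused)."""
    q = 1.0 - 1.0 / n
    phi = 1.0  # phi_0 = 1
    lam = [0.0]
    for _ in range(h):
        nxt = sum(c * (q * phi) ** j for j, c in offspring.items())  # t(phi)
        lam.append(phi - nxt)
        phi = nxt
    return lam


def expected_cost(lam: List[float]) -> float:
    """Evaluate the three sums of Eqn. (16) for the given lambda sequence."""
    h = len(lam) - 1
    tail = [0.0] * (h + 2)  # tail[i] = lambda_i + ... + lambda_h
    for i in range(h, 0, -1):
        tail[i] = tail[i + 1] + lam[i]
    x = [0.0] * (h + 1)  # x[h] = 0 (assumed boundary value)
    for i in range(h - 1, 0, -1):
        x[i] = max(
            x[j] + (j - i) * tail[i + 1] / lam[i + 1]
            for j in range(i + 1, h + 1)
        )
    a = [0.0] * (h + 2)
    a[h] = 1.0
    for i in range(h - 1, 0, -1):
        a[i] = x[i] + a[i + 1] + 1.0
    holder = sum(lam[i] * a[i] for i in range(1, h + 1))        # answer holder
    referral = sum(lam[i] * x[i - 1] for i in range(2, h + 1))  # direct referral
    forward = sum(lam[i] * (i - 1) for i in range(1, h + 1))    # forwarding agents
    return holder + referral + forward
```

One possible use is to call `expected_cost(first_answer_probs(offspring, n, h))` for increasing `h` and compare the growth against $h^2$; the constants depend on the chosen distribution and on $n$.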

We obtain the main result of this paper. 

Theorem 1.2 (restated). For any branching process with branching factor $b > 1$, there
is a constant $p_b > 0$ such that for any answer rarity $n$ with $0 < 1/n < p_b$ there exists a
sybil-proof direct referral query incentive mechanism with expected cost $O(h^2)$, where $h$ is
the desired level of agents the root wants to propagate the query to.
6. CONCLUSION

In this paper, we developed a set of sybil-proof query incentive mechanisms which allocate
most rewards to the selected answer holder as well as its direct referral. We showed that such
a direct referral mechanism, with properly designed rewards, is efficient, with expected cost
$O(h^2)$ for any $b > 1$. In contrast, all previous mechanisms require a reward exponential in $h$
when $h$ is large enough for finding an answer with high probability. We also showed that if
the underlying branching process is a chain, our DR mechanism is optimal under some mild
assumptions. Since our reward scheme has a very simple structure, it might have a good chance
of being adopted in practice.

One drawback of our DR mechanism is that it is only efficient in expectation. In some
cases, e.g., when the first agent has an answer, the cost will be exponential in $h$ even in
the case of $b > 2$ and $h = O(\log n)$. As a result, our mechanism is not collusion-proof.
Agents can collude with each other to improve their overall reward. It is an interesting open
problem to design efficient sybil-proof incentive mechanisms to address these issues.
A. THE OPTIMALITY OF THE DR MECHANISM ON CHAINS

We first characterize such regular sybil-proof mechanisms.

Lemma A.1. On a chain, if a regular query incentive mechanism is sybil-proof, its
reward scheme will only reward the first answer holder and its ancestors.

Proof. Consider any final answer configuration of a chain with length $h$. If the reward
scheme rewards agents beyond the first answer holder, then in our sybil-attacking model the
first answer holder can produce the remaining chain with its sybils and collect all remaining
rewards. Notice that an agent can observe the results of its ancestors. Therefore, such a
reward scheme is not sybil-proof. □

Since any regular sybil-proof mechanism only rewards the first answer, we can use the
reward framework given in Section 4.1. In particular, we define $r(i, s)$ as the reward to the
$i$-th agent on an answer path with length $i + s \le h$, i.e., the first answer appears at level
$i + s$.

Proof of Theorem 4.5. For simplicity, we define $a(i) = r(i, 0)$. Consider another regular
mechanism that defines reward $\hat{r}(i, s)$ to the $i$-th node if the selected answer holder is at the
$(i + s)$-th position and reward $\hat{a}(j)$ to the answer holder if the answer is at the $j$-th position. In
order to show the optimality of our mechanism, we only need to prove that $\hat{r}(i, s) \ge r(i, s)$
and $\hat{a}(j) \ge a(j)$.

We first show that $\hat{r}(i, s) \ge r(i, s)$. Note that we only need to prove this for $i \le h - 1$ and
$s = 1$, since in other cases the inequality is obviously correct.

Let $\hat{R}(i, k) = \sum_{s=1}^{h-i-k} p(1 - p)^{s-1} \sum_{j=0}^{k} \hat{r}(i + j, s + k - j)$, which is the expected reward
that the node at depth $i$ can get by creating $k$ additional sybils, conditioned on the event
that all previous nodes including itself do not have any answer. We inductively show that
$\hat{r}(i, s) \ge r(i, s)$ for $i \le h - 1$ and $s = 1$. This is clearly true for $i = h - 1$. Assume that this
is true for $i + 1$. For $i$, since the other mechanism is sybil-proof in a Nash equilibrium, we
have

$$\begin{aligned} \hat{R}(i, 0) \ge \hat{R}(i, 1) &= \sum_{s=1}^{h-i-1} p(1 - p)^{s-1} \hat{r}(i, s + 1) + \sum_{s=1}^{h-i-1} p(1 - p)^{s-1} \hat{r}(i + 1, s) \\ &\ge \sum_{s=1}^{h-i-1} p(1 - p)^{s-1} \hat{r}(i, s + 1) + \sum_{s=1}^{h-i-1} p(1 - p)^{s-1} r(i + 1, s) \qquad (22) \\ &= \sum_{s=1}^{h-i-1} p(1 - p)^{s-1} \hat{r}(i, s + 1) + R_{i+1}. \end{aligned}$$

The inequality in Eqn. (22) is by induction. So we have

$$\hat{R}(i, 0) = p \cdot \hat{r}(i, 1) + \sum_{s=2}^{h-i} p(1 - p)^{s-1} \hat{r}(i, s) \ge \sum_{s=1}^{h-i-1} p(1 - p)^{s-1} \hat{r}(i, s + 1) + R_{i+1}. \tag{23}$$
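Rearranging the terms in the above inequality, we have

$$\begin{aligned} p \cdot \hat{r}(i, 1) &\ge \sum_{s=1}^{h-i-1} p(1 - p)^{s-1} \hat{r}(i, s + 1) - \sum_{s=2}^{h-i} p(1 - p)^{s-1} \hat{r}(i, s) + R_{i+1} \\ &= \sum_{s=1}^{h-i-1} p(1 - p)^{s-1} \hat{r}(i, s + 1) - \sum_{s=1}^{h-i-1} p(1 - p)^{s} \hat{r}(i, s + 1) + R_{i+1} \\ &= p \cdot \sum_{s=1}^{h-i-1} p(1 - p)^{s-1} \hat{r}(i, s + 1) + R_{i+1} \\ &\ge p \cdot P_{h-i-1} + R_{i+1} \\ &= p \cdot r(i, 1). \end{aligned}$$

Thus, $\hat{r}(i, 1) \ge r(i, 1)$.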

Next we will inductively show that $\hat{a}(j) \ge a(j)$. For $j = h$, $\hat{a}(j) \ge 1 = a(j)$. Assume that
$\hat{a}(j) \ge a(j)$ for all $j \ge i + 1$; then for $j = i$, by the sybil-proofness of the other mechanism,
we have

$$\hat{a}(i) \ge \hat{r}(i, 1) + \hat{a}(i + 1) \ge r(i, 1) + a(i + 1) = a(i). \tag{24}$$

Combining all of the above, we have shown the optimality of our DR mechanism. □



